Hexin Liu

The WER Trap: Shattering the Illusion of Unified Tokens in Speech Language Models

May 28, 2026
Via arXiv

DuplexSLA: A Full-Duplex Spoken Language Model with Synchronized Speech, Language, and Action

May 20, 2026
Via arXiv

Evaluating the Expressive Appropriateness of Speech in Rich Contexts

May 10, 2026
Via arXiv

Step-Audio-R1.5 Technical Report

Apr 28, 2026
Via arXiv

The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning

Mar 18, 2026
Via arXiv

LLM-ForcedAligner: A Non-Autoregressive and Accurate LLM-Based Forced Aligner for Multilingual and Long-Form Speech

Jan 26, 2026
Via arXiv

The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge

Jan 12, 2026
Via arXiv

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Jan 02, 2026
Via arXiv

Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models

Oct 10, 2025
Via arXiv

Bi-directional Context-Enhanced Speech Large Language Models for Multilingual Conversational ASR

Jun 16, 2025
Via arXiv